- Translation system for Linuxconf
- Introduction
-
- Linuxconf is a large software component, full of menus and dialogs.
- To be easily translatable, all messages must be extracted from the C++
- source code and placed into dictionaries which can be translated
- efficiently. A special set of tools has been designed to achieve this.
- They are described here.
-
- 1. Introduction
-
- This document describes both how the system works and how translators
- can use it. It starts by explaining how programmers can use it to
- produce translatable programs. The section "How to translate" explains
- how translators can use this system to translate Linuxconf or any
- program written using this system.
-
- 2. Principles
-
- To make programs easily translatable, all messages should be placed in
- dictionaries. A dictionary is made of message entries. Each message
- has a unique ID and a value. In the C++ source, programmers refer to
- those messages by ID whenever they want to print or say something.
-
- Each time a programmer needs a new message, he has to add it to the
- message dictionary and reference it from the C++ source code. This is
- how most systems work (there are other translation systems out there).
-
- The system used by Linuxconf is fundamentally different. Messages are
- defined in the C++ source code and the dictionaries are built by
- scanning all C++ source files. Programmers must provide an ID and a
- value for each message right in the source code. It is much easier (or
- nicer) to do this right in the source code than to go back and forth
- between the source and the dictionary. Furthermore, the programmer
- directly sees the message definition in the source. With other
- systems, only the message ID is visible in the source.
-
- Using the magic of the C preprocessor, the message value is not
- compiled into the object code at all. Seen this way, the translation
- system used by Linuxconf yields the same result as other systems. It is
- just nicer to use for programmers.
-
- Let's describe how a programmer uses the system.
-
- 2.1. One dictionary per source directory
-
- It is best to define one message dictionary per sub-project or
- sub-directory. This is easier to manage and avoids ID name space
- congestion. For each source directory of Linuxconf you have one "dic"
- file and one "m" file. Both files are produced simply by doing
-
- make msg
-
- This command scans all C++ source files of the current directory and
- updates the file ../messages/sources/DIRECTORY.dic and the file
- DIRECTORY.m, where DIRECTORY is the name of the current directory.
-
- make msg uses the ../translate/msgscan utility to scan the source. This
- utility looks for specific constructs in the C++ source files. Here
- they are.
-
- 2.2. The MSG_U macro
-
- The MSG_U macro defines a new message. It defines both its ID and its
- value. This macro is usable anywhere a C++ string would be.
-
- #include <stdio.h>
- #include "prjfoo.m"
-
- int foo()
- {
-     /* M_MSG1 is the ID, the string is the message value */
-     printf (MSG_U(M_MSG1,"Entering function foo"));
-     return 0;
- }
-
- MSG_U defines a single value: the U stands for unilingual.
-
- 2.3. The MSG_B macro
-
- The MSG_B macro is like the MSG_U macro, except that it defines two
- values, allowing a programmer to code two languages at once. The B
- stands for bilingual. This has not been used in the Linuxconf project
- but has proven effective for other projects.
-
- #include "prjfoo.m"
-
- int foo()
- {
-     /* One ID, two values: English first, French second */
-     printf (MSG_B(M_MSG1
-         ,"Entering function foo\n"
-         ,"Démarrage de la fonction foo\n"));
-     return 0;
- }
-
- 2.4. The MSG_R macro
-
- The MSG_R macro simply references an already defined message. This
- message may have been defined in another source file (of the same
- project). Like the other macros, MSG_R may be used anywhere a C++
- string is.
-
- 2.5. The MSG_VERSION macro
-
- This macro has not been used so far. It would allow a programmer to
- raise the version number of a dictionary, preventing older
- applications from using a newer, potentially incompatible dictionary.
-
- The msgclean utility also changes the version number of the
- dictionary. The MSG_VERSION macro is still a concept rather than a
- useful addition. Stay tuned...
-
- 2.6. The magic of the MSG_ macros
-
- The MSG_ macros perform two tasks. First, they are easily spotted by
- the msgscan utility. The parsing is simple and reliable even if the
- C++ source code is not functional. Second, they hide the retrieval
- mechanism (how the message value is retrieved from the binary
- dictionary at runtime).
-
- The msgscan utility produces the .m file, which looks like this for
- the simple example above.
-
- FILE prjfoo.m:
-
- extern const char **_dictionnary_prjfoo;
- #ifndef DICTIONNARY_REQUEST
-     #define DICTIONNARY_REQUEST \
-         const char **_dictionnary_prjfoo;\
-         TRANSLATE_SYSTEM_REQ _dictionnary_req_prjfoo\
-             ("prjfoo",_dictionnary_prjfoo,55,1);\
-         void dummy_dict_prjfoo(){}
- #endif
- #ifndef MSG_U
-     #define MSG_U(id,m) id
-     #define MSG_B(id,m,n) id
-     #define MSG_R(id) id
- #endif
- #define M_MSG1 _dictionnary_prjfoo[0]
-
- As you see, one global variable is created: _dictionnary_prjfoo. A
- special macro, DICTIONNARY_REQUEST, is also defined. This macro should
- be placed in exactly one source file of the project. It is generally
- placed in the file _dict.c presented later.
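-
- The effect of these macros can be sketched in a single file. This is
- only an illustration, not the actual Linuxconf code: the relevant
- lines of prjfoo.m are inlined, and load_fake_dictionary() is a
- hypothetical stand-in for whatever loads the binary dictionary at
- run time.
-
```cpp
#include <cassert>
#include <cstring>

// --- Lines that msgscan would generate in prjfoo.m (inlined here) ---
const char **_dictionnary_prjfoo = nullptr;   // filled in at run time
#define MSG_U(id,m) id                         // the value m is dropped
#define M_MSG1 _dictionnary_prjfoo[0]          // the ID is a slot in the table

// Hypothetical stand-in for the real dictionary loader.
static const char *fake_table[] = { "Démarrage de la fonction foo" };

void load_fake_dictionary()
{
    _dictionnary_prjfoo = fake_table;
}

// The English text never reaches the object code: after preprocessing,
// the body of this function is just "return _dictionnary_prjfoo[0];".
const char *foo_message()
{
    return MSG_U(M_MSG1, "Entering function foo");
}
```
-
- Once load_fake_dictionary() has been called, foo_message() returns
- whatever text sits in slot 0 of the table, whatever the language.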
-
- 3. How to use it
-
- To produce a translatable program, do the following:
-
- · Replace all string messages with MSG_U or MSG_B macros, giving each
-   message a unique ID.
-
- · Include (#include) the .m file in each source file using the MSG_x
-   macros. This file is generally named directory.m, where directory is
-   the name of the current directory.
-
- · Create a file _dict.c. The content of this file is shown below.
-
- · Use "make msg" to extract the messages. This produces or updates the
-   dictionary file directory.dic and produces the include file
-   directory.m.
-
- · Compile and link your program.
-
- · Use "make msg.eng" to produce the English binary dictionary. The
-   file produced should be placed where your program expects it.
-
- We will now describe further the different steps involved.
-
- 3.1. The make msg command and msgscan utility
-
- The make msg command invokes the msgscan utility. This utility scans a
- set of C or C++ source files, updates a dictionary file and produces
- one include file.
-
- Here is the command used to update the dictionary of the sub-project
- uucp of the Linuxconf project.
-
- ../translate/msgscan uucp \
- ../messages/sources/uucp.dic uucp.m EF *.c
-
- The first argument is the name of the dictionary. The second argument
- is the path of the dictionary file. As you see, dictionary files are
- kept in a single directory for all projects. They are seldom. This
- eases the work of translators. The third argument is the path of the
- include file, which is produced in the current directory.
-
- The fourth argument gives the letter tags used to identify messages
- defined with the macros MSG_U and MSG_B. Messages defined with MSG_U
- are tagged with the letter E (English), and messages defined with
- MSG_B are tagged with E for the first value and F (French) for the
- second.
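-
- The kind of scanning described here can be sketched as follows. This
- is not the real msgscan: the function scan_msg_u() and the struct
- ScannedMessage are hypothetical, only single-line MSG_U constructs
- without escaped quotes are recognised, and no .dic file is written.
-
```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// One value extracted from the source: message ID, tag letter, text.
struct ScannedMessage {
    std::string id;
    char tag;
    std::string value;
};

// Illustrative only: finds MSG_U(ID,"text") constructs and tags each
// value with the first letter of `tags` (for example 'E' from "EF").
// Escaped quotes, MSG_B and multi-line macros are not handled here.
std::vector<ScannedMessage> scan_msg_u(const std::string &source,
                                       const std::string &tags)
{
    std::vector<ScannedMessage> found;
    if (tags.empty()) return found;
    static const std::regex pattern(
        R"re(MSG_U\s*\(\s*([A-Za-z_][A-Za-z0-9_]*)\s*,\s*"([^"]*)"\s*\))re");
    for (std::sregex_iterator it(source.begin(), source.end(), pattern), end;
         it != end; ++it) {
        found.push_back({(*it)[1].str(), tags[0], (*it)[2].str()});
    }
    return found;
}
```
-
- Because the macros have such a regular shape, a simple pattern match
- is enough to find them, even in a source file that does not compile.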
-
- 3.2. The _dict.c file
-
- It is good practice to place the DICTIONNARY_REQUEST macro in a file
- _dict.c. There is generally one such file per directory. Its
- content is generally:
-
- #include "this_directory.m"
- #include <translat.h>
- DICTIONNARY_REQUEST
-
- At least this dependency should be placed in your makefile:
-
- _dict.o: _dict.c this_directory.m
-
- This ensures that each time you update your dictionary (and the m
- header file), _dict.c is recompiled, so that the dictionary revision
- and the number of messages are properly recorded in the program. This
- avoids executing a program with an obsolete or incompatible binary
- dictionary.
-
- Given that _dict.c is small, the compilation is pretty short.
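-
- The check made possible by this recording might look like the sketch
- below. Everything in it is an assumption made for illustration: the
- real TRANSLATE_SYSTEM_REQ class and the binary dictionary header are
- not shown in this document, so the structs, field names and matching
- rule are hypothetical.
-
```cpp
#include <cassert>
#include <string>

// Hypothetical record of what a program was compiled against.
struct DictRequirement {
    std::string name;   // dictionary name, e.g. "prjfoo"
    int nb_messages;    // number of messages the program expects
    int revision;       // dictionary revision the program expects
};

// Hypothetical header information read from a binary dictionary.
struct DictHeader {
    std::string name;
    int nb_messages;
    int revision;
};

// A dictionary is accepted only if it matches what was compiled in.
// Assumed rule: same name, same revision, at least as many messages.
bool dictionary_is_compatible(const DictRequirement &req,
                              const DictHeader &hdr)
{
    return req.name == hdr.name
        && req.revision == hdr.revision
        && hdr.nb_messages >= req.nb_messages;
}
```
-
- If _dict.c were not recompiled after the m file changed, the values
- recorded in the program would be stale and a check of this kind would
- compare against the wrong numbers.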
-
- 3.3. The msgcomp utility
-
- Once you have compiled and linked your program, you must "compile"
- all the dictionaries used in your program into one binary
- dictionary. This is done by the msgcomp utility. Here is the command
- used when doing "make msg.eng" for the Linuxconf project. This
- produces the English binary dictionary.
-
- ../translate/msgcomp -p../messages/sources/ \
- /tmp/linuxconf-msg-1.3.eng eE \
- askrunlevel dialog dnsconf fstab \
- misc main netconf mailconf uucp userconf
-
- This command takes the dictionaries of the sub-projects askrunlevel,
- dialog, dnsconf, fstab, misc, main, netconf, mailconf, uucp and
- userconf and produces a single binary dictionary.
-
- The -p option tells msgcomp to look for those dic files
- (askrunlevel.dic, dialog.dic, ...) in the directory
- ../messages/sources/.
-
- The argument /tmp/linuxconf-msg-1.3.eng is the file to produce. The
- argument eE instructs msgcomp to extract message values with the 'e'
- tag. If there is no such value for a given message, the value with the
- 'E' tag is used.
-
- 3.3.1. Convention used for tags
-
- Dictionary files contain the definitions of all messages. Each
- message may have several values, each identified by a tag letter. When
- messages are extracted by msgscan, it is instructed to associate
- values with given tags. By convention, we use upper case letters to
- identify message values extracted from the source code. Lower case
- letters are used by translators.
-
- We assume here that programmers are bad writers. We let them give
- their best shot at the messages, and we are allowed to override their
- work without overwriting it. By giving precedence to 'e' tags over 'E',
- we are saying that the translators' work overrides the work of
- programmers, but we are not forcing the translators to rewrite
- everything.
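-
- The precedence rule can be sketched as a small lookup. The type
- MessageValues and the function select_value() are hypothetical names
- used for illustration; they are not part of msgcomp.
-
```cpp
#include <cassert>
#include <map>
#include <string>

// All values of one message, keyed by tag letter ('E', 'e', 'F', ...).
using MessageValues = std::map<char, std::string>;

// Returns the value for the first tag in `precedence` (e.g. "eE") that
// the message defines, or an empty string if none of the tags is present.
std::string select_value(const MessageValues &values,
                         const std::string &precedence)
{
    for (char tag : precedence) {
        auto it = values.find(tag);
        if (it != values.end()) return it->second;
    }
    return std::string();
}
```
-
- With the precedence string "eE", a message that only has an 'E' value
- keeps the programmer's wording, and a message that also has an 'e'
- value uses the translator's wording instead.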
-
- 3.4. The msgclean utility
-
- The msgscan utility maintains the dictionary files. At some point some
- messages may become obsolete (unused in any source file). The msgclean
- utility is used to remove messages without values from the dic file.
-
- For the Linuxconf project, the make target msg.clean is defined for
- that purpose.
-
- Be aware that applying msgclean to a dictionary file with obsolete
- messages has an important side effect. Because some messages are
- deleted, the numbering of all the following messages changes. All
- sources using the m include file should be recompiled.
-
- To avoid problems, the msgclean utility automatically increases the
- revision number of the dictionary. This prevents using a dictionary
- with an incompatible program.
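-
- The renumbering side effect can be sketched as follows. The structs
- and the function clean() are hypothetical and do not reflect the real
- .dic format. Raising the revision only when something was actually
- removed is an assumption of this sketch.
-
```cpp
#include <cassert>
#include <string>
#include <vector>

struct Entry {
    std::string id;
    bool has_value;   // false when the message no longer has any value
};

struct Dictionary {
    int revision;
    std::vector<Entry> entries;   // position in the vector = message number
};

// Removes entries without values. If anything was removed, the entries
// that follow shift down, so the revision is raised to invalidate
// programs compiled against the old numbering.
// Returns the number of entries removed.
int clean(Dictionary &dict)
{
    std::vector<Entry> kept;
    for (const Entry &e : dict.entries)
        if (e.has_value) kept.push_back(e);
    int removed = static_cast<int>(dict.entries.size() - kept.size());
    if (removed > 0) {
        dict.entries = kept;
        dict.revision++;
    }
    return removed;
}
```
-
- In a dictionary holding M_A, M_OLD and M_B, removing M_OLD moves M_B
- from number 2 to number 1, which is why every source using the m file
- has to be recompiled.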
-
- 4. Usage restriction
-
- The strategy used is mainly targeted at C++ code. With some
- restrictions, it may be used for C code. Here is the main feature that
- probably doesn't work with C.
-
- static initialisation
-   In C++ one can write the following code.
-
-       static char *tb[]={
-           foo(1),foo(22)
-       };
-
-   where foo is a function. The C++ compiler generates the
-   initialisation code, which will probably be called once. The MSG_U
-   macro (and the others) do not hide a function call, but they are
-   nevertheless dynamic in some sense: each one expands to a run-time
-   lookup in the dictionary table. C does not support this kind of
-   initialisation. Other dictionary-based translation strategies have
-   the same limitation, though.
-
- The example using the static char *tb[] also causes a problem in
- C++ if the variable is declared outside of a function. The problem
- appears because the "hidden" initialisation code generated by the
- compiler is called very early, often before main() is called.
- Normally, the function translat_load(), which brings the dictionary
- into memory, is called by main().
-
- Fortunately, with the current implementation, where
- _dictionnary_system is a pointer, a segmentation fault is triggered
- whenever this condition is met. The fault occurs every time, because
- all such initialisations run before main(). The problem therefore
- cannot go unnoticed, so the strategy is safe.
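-
- One way to stay clear of the problem is to evaluate the macros inside
- a function instead of in a file-scope initialiser. The sketch below
- illustrates this under stated assumptions: the macro definitions are
- inlined stand-ins for a generated .m file, and the IDs B_YES and B_NO
- and the function answer_label() are hypothetical.
-
```cpp
#include <cassert>
#include <string>

// Stand-ins for a generated .m file (assumptions for this sketch).
const char **_dictionnary_prjfoo = nullptr;   // null until the dictionary is loaded
#define MSG_U(id,m) id
#define B_YES _dictionnary_prjfoo[0]
#define B_NO  _dictionnary_prjfoo[1]

// UNSAFE at file scope: the initialiser would run before main(), while
// _dictionnary_prjfoo is still null, and dereference a null pointer.
//   static const char *tb[] = { MSG_U(B_YES,"Yes"), MSG_U(B_NO,"No") };

// Safer: evaluate the macros inside a function that is only called
// after the dictionary has been loaded.
const char *answer_label(bool yes)
{
    return yes ? MSG_U(B_YES,"Yes") : MSG_U(B_NO,"No");
}
```
-
- The lookup now happens at call time, so it sees the table as loaded
- by main() rather than the null pointer present during static
- initialisation.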
-
- 5. Recommended usage and conventions
-
- 5.1. Naming convention for message IDs
-
- To help people who will translate Linuxconf, I have used a
- convention for the ID names.
-
- B_ Buttons.
-
- E_ Error messages start with this.
-
- F_ Field labels start with this.
-
- I_ Dialog introductions start with this.
-
- M_ All menu entries start with this prefix.
-
- N_ Notices and warnings start with this.
-
- P_ When the user is prompted for a password, the message ID starts
-    with this.
-
- Q_ Identifies a question (generally a Yes/No prompt).
-
- T_ Dialog titles start with this.
-
- X_ All other messages, which fit in no category.
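-
- A convention like this is easy to check mechanically. The helper
- below is a hypothetical illustration, not part of the Linuxconf
- tools: it maps an ID to the category implied by its prefix.
-
```cpp
#include <cassert>
#include <string>

// Maps a message ID to the category implied by its prefix, following
// the list above. Returns "unknown" for IDs that do not follow the
// convention.
std::string id_category(const std::string &id)
{
    if (id.size() < 2 || id[1] != '_') return "unknown";
    switch (id[0]) {
    case 'B': return "button";
    case 'E': return "error message";
    case 'F': return "field label";
    case 'I': return "dialog introduction";
    case 'M': return "menu entry";
    case 'N': return "notice or warning";
    case 'P': return "password prompt";
    case 'Q': return "question";
    case 'T': return "dialog title";
    case 'X': return "other";
    default:  return "unknown";
    }
}
```
-
- A translator's tool could use such a function to group the messages
- of a .dic file by kind before presenting them.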
-
- 6. How to translate
-
- 6.1. Go simple
-
- One way to translate is to go right into the .dic files and add
- translations for each message using a different tag. Then use the
- msgcomp utility to extract the proper definition.
-
- At first, there is little problem doing this. The msgscan utility
- reads, updates and saves the .dic file, so your changes won't be lost.
-
- The problem comes from the way software is developed. First we develop
- and then, when it is stable, we translate. Doing so means that we have
- to walk through all the .dic files to make sure our translations still
- fit the original messages (the English version, for example). Those
- original messages may have changed.
-
- A different scheme was chosen for Linuxconf.
-
- 6.2. Organisation of the messages directory
-
- The messages directory contains one subdirectory per language plus one
- sources directory. The sources directory contains all the .dic files
- produced by msgscan. These files are never edited by hand.
-
- Each other directory has a copy of those .dic files with the proper
- translation. A special utility, msgupd, has been created: it basically
- compares all messages in the sources directory with the messages in the
- translated directory. It compares only one language (say the English
- version).
-
- Mostly, msgupd will tell you:
-
- · Which messages are new.
-
- · Which messages have changed (the English wording).
-
- Using that information, you know exactly what you have to do to keep
- your work in sync with the current release of Linuxconf. msgupd
- reorders the translated .dic file (not the one in the sources
- directory) so that all messages which need work are at the beginning
- of the file. It also adds a comment (.dic files may have comments,
- like most normal Unix configuration files) explaining what has to be
- done.
-
- If the English version of a message was changed, msgupd retags the old
- version in the translated file and adds the new version, plus a
- comment. The old English message gets the tag "Z", so you can easily
- see what changed.
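-
- The comparison at the heart of this process can be sketched as
- follows. This is an illustration under stated assumptions, not the
- real msgupd: dictionaries are reduced to a map from message ID to
- English text, and the names Texts, UpdateReport and
- compare_dictionaries() are hypothetical. Reordering, comments and the
- "Z" retagging are left out.
-
```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Reference (English) text per message ID, for one dictionary.
using Texts = std::map<std::string, std::string>;

struct UpdateReport {
    std::vector<std::string> added;     // IDs present only in the sources
    std::vector<std::string> changed;   // IDs whose English wording differs
};

// Compares the English text in the sources dictionary with the English
// text recorded in the translated dictionary.
UpdateReport compare_dictionaries(const Texts &sources,
                                  const Texts &translated)
{
    UpdateReport report;
    for (const auto &entry : sources) {
        auto it = translated.find(entry.first);
        if (it == translated.end())
            report.added.push_back(entry.first);
        else if (it->second != entry.second)
            report.changed.push_back(entry.first);
    }
    return report;
}
```
-
- Messages that appear in neither list are already in sync and need no
- attention from the translator.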
-
- 6.3. The msgupd utility
-
- The file rules.mak shows the rules for one translation (which is not
- done yet). Look for the targets msg.cfr and upd.cfr. To add a new
- language, do this:
-
- · Create a new empty directory in the messages directory, for
-   example, mar for an alien language.
-
- · Customise rules.mak and add the targets msg.mar and upd.mar.
-
- · Run the following command. This fills the messages/mar
-   directory with all the necessary .dic files.
-
-   make upd.mar
-
- · Go into messages/mar, edit each .dic file and add the proper
-   translations as needed.
-
- · Run the following command to produce the binary dictionary
-   required to run Linuxconf.
-
-   make msg.mar
-
- · Set the following environment variables and run Linuxconf.
-
-   · export LINUXCONF_LANG=mar
-
-   · export LINUXCONF_DICT=/tmp
-
-     This variable is optional. Linuxconf normally looks for its
-     message dictionary in /usr/lib/linuxconf. This variable overrides
-     that location. The msg.* makefile targets generally produce their
-     output in /tmp. This is useful to test new messages without
-     breaking the current installation of Linuxconf.
-
-   Be aware that this mechanism only works if you execute Linuxconf as
-   root. For security reasons, a normal user can't override the message
-   dictionary of Linuxconf (although he can select a different
-   language from /usr/lib/linuxconf if available).
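-
- The lookup rules just described can be sketched as follows. This is
- not the actual Linuxconf code: the struct DictLocation and the
- function locate_dictionary() are hypothetical, and using "eng" when
- LINUXCONF_LANG is unset is an assumption of this sketch.
-
```cpp
#include <cassert>
#include <string>

// Where to look for the binary dictionary, and for which language.
struct DictLocation {
    std::string directory;
    std::string language;
};

// `lang_env` and `dict_env` are the values of LINUXCONF_LANG and
// LINUXCONF_DICT (nullptr when unset), e.g. as returned by getenv().
// Assumption: "eng" is the default when LINUXCONF_LANG is unset.
DictLocation locate_dictionary(const char *lang_env, const char *dict_env,
                               bool is_root)
{
    DictLocation loc{"/usr/lib/linuxconf", "eng"};
    // Any user may select another language.
    if (lang_env != nullptr && *lang_env != '\0')
        loc.language = lang_env;
    // Only root may redirect the lookup to another directory.
    if (is_root && dict_env != nullptr && *dict_env != '\0')
        loc.directory = dict_env;
    return loc;
}
```
-
- A caller would typically pass getenv("LINUXCONF_LANG"),
- getenv("LINUXCONF_DICT") and the result of a root check such as
- geteuid() == 0.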
-
- 6.4. The msgcomp utility
-
- The msgcomp utility has been tweaked to support the distributed
- directory concept. Mainly, it uses the .dic file in the sources
- directory as a reference. Message ID numbers are defined from this
- file. It then uses (optionally) alternative
-
- 7. Licensing
-
- The translate directory is part of the Linuxconf project but carries a
- special license. There is no restriction on usage. Feel free to
- incorporate this system into any project.
-
- This simple license does not apply to the rest of Linuxconf, which is
- covered by the standard GNU Copyleft license. See the file LICENSE in
- the root directory.
-
- If you find it useful for other projects, send me a note and some
- comments if possible.
-
-